<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Guide for Developing Applications with Terrier</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" type="text/css" charset="utf-8" media="all" href="docs.css">
</head>

<body>
<!--!bodystart-->
[<a href="properties.html">Previous: List of Terrier properties</a>] [<a href="index.html">Contents</a>] [<a href="extend_indexing.html">Next: Extending Indexing</a>]
<table width="100%">
  <tr> 
    <td width="82%" valign="bottom"><h1>Developing Applications with Terrier</h1></td>
	<!--!bodyremove-->
    <td width="18%"><a href="http://ir.dcs.gla.ac.uk/terrier/"><img src="images/terrier-logo-web.jpg" border="0"></a></td>
	<!--!/bodyremove-->
  </tr>
</table>
<p align="justify">Terrier provides APIs for <a href="extend_indexing.html">indexing</a> documents, and <a href="extend_retrieval.html">querying</a> 
  the generated indices. If you are developing applications using Terrier or extending it for your own research, then 
  you may find the following information useful.</p>

<h2>Extending Terrier</h2>
<p align="justify">Terrier has a flexible, modular architecture: many classes have interchangeable alternatives, so many parts of the indexing and retrieval process are easy to change. Before attempting any in-depth extension of Terrier, review the <a href="properties.html">properties</a> that can be configured. For instance, if you write a new Matching class, you can use it in a TREC-like setting by setting the property <tt>trec.matching</tt>; if you write a new document weighting model, set the property <tt>trec.model</tt> to use it, or add it to <tt>etc/trec.models</tt>. For more information about extending the retrieval functionality of Terrier, see <a href="extend_retrieval.html">Extending Retrieval</a>; for the indexing process Terrier uses, see <a href="extend_indexing.html">Extending Indexing</a>.</p>

<h3>FileSystem Abstraction Layer</h3>
<p align="justify">All file I/O in Terrier (excluding the Desktop application and Terrier configuration) is performed using the <a href="javadoc/uk/ac/gla/terrier/utility/Files.html">Files</a> class. This allows Terrier to run in a variety of environments. In Terrier 2.1, a FileSystem abstraction layer was integrated into the Files class, so that other <a href="javadoc/uk/ac/gla/terrier/utility/io/FileSystem.html">FileSystem</a> implementations can be plugged in. By default, Terrier ships with two implementations: <a href="javadoc/uk/ac/gla/terrier/utility/io/LocalFileSystem.html">LocalFileSystem</a>, for reading the local file system using the Java API, and <a href="javadoc/uk/ac/gla/terrier/utility/io/HTTPFileSystem.html">HTTPFileSystem</a>, for reading files accessible over the HTTP or HTTPS protocols. Each filename is searched for a scheme prefix (e.g. "file://"), as in a URI or URL. If a scheme is detected, Terrier searches its known file system implementations for one that supports that scheme. If no scheme can be found in the filename, <tt>file://</tt> is assumed; if the filename starts with <tt>http://</tt>, the file is fetched by HTTP.</p>
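<p align="justify">The scheme lookup described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not Terrier's implementation: the class <tt>SchemeLookupSketch</tt> and its methods are hypothetical names, and plain strings stand in for file system implementations.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names are Terrier classes.
class SchemeLookupSketch {
    // Maps a scheme (e.g. "http") to a label standing in for a file system implementation.
    private final Map<String, String> handlers = new LinkedHashMap<String, String>();

    // Registers a handler label for the given scheme.
    void register(String scheme, String handlerName) {
        handlers.put(scheme, handlerName);
    }

    // Returns the text before "://", or "file" when the filename carries no scheme.
    static String schemeOf(String filename) {
        int idx = filename.indexOf("://");
        return idx > 0 ? filename.substring(0, idx) : "file";
    }

    // Returns the handler registered for the filename's scheme, or null if none supports it.
    String handlerFor(String filename) {
        return handlers.get(schemeOf(filename));
    }

    public static void main(String[] args) {
        SchemeLookupSketch lookup = new SchemeLookupSketch();
        lookup.register("file", "local");
        lookup.register("http", "http");
        lookup.register("https", "http");
        System.out.println(lookup.handlerFor("/var/index/data.properties"));
        System.out.println(lookup.handlerFor("http://example.org/collection.spec"));
    }
}
```

<p align="justify">A filename without a scheme falls through to the "file" entry, which mirrors the default described above; a scheme with no registered handler yields <tt>null</tt> in this sketch, whereas a real implementation would more likely report an error.</p>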

<p align="justify">The Files layer can also transform paths to filenames on the fly. For example, if a certain HTTP namespace is accessible as a local file system, the Files layer can be informed using <tt>Files.addPathTransformation()</tt>. </p>
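<p align="justify">To illustrate the idea of a path transformation, the following sketch rewrites a path prefix before the file is opened. It is an assumption-laden illustration only: <tt>PathTransformSketch</tt>, <tt>addRule</tt> and <tt>transform</tt> are hypothetical names, and simple prefix matching is used here for clarity. Consult the Javadoc of the Files class for the actual signature and matching rules of <tt>Files.addPathTransformation()</tt>.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of prefix-based path rewriting; not Terrier's implementation.
class PathTransformSketch {
    // Rules are kept in insertion order; the first matching prefix wins.
    private final Map<String, String> rules = new LinkedHashMap<String, String>();

    // Adds a rule replacing the given prefix with the given replacement.
    void addRule(String prefix, String replacement) {
        rules.put(prefix, replacement);
    }

    // Applies the first rule whose prefix matches; otherwise returns the path unchanged.
    String transform(String path) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (path.startsWith(rule.getKey())) {
                return rule.getValue() + path.substring(rule.getKey().length());
            }
        }
        return path;
    }

    public static void main(String[] args) {
        PathTransformSketch t = new PathTransformSketch();
        // Assumed example: an HTTP namespace that is also mounted locally.
        t.addRule("http://example.org/data/", "/mnt/data/");
        System.out.println(t.transform("http://example.org/data/wt2g/B01.gz"));
    }
}
```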

<p align="justify">Additional implementations implement those methods of the FileSystem interface that they support, and register themselves by calling the <tt>Files.addFileSystemCapability()</tt> method. A FileSystem declares the operations it supports on a file or path by returning the bit-wise OR of the constants defined in <tt>Files.FSCapability</tt>.</p>
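<p align="justify">The bit-wise OR convention can be sketched as below. The constant names and values here are assumptions made for illustration; the real constants are those defined in <tt>Files.FSCapability</tt>, and <tt>CapabilitySketch</tt> is a hypothetical class.</p>

```java
// Hypothetical sketch of capability flags combined with bit-wise OR.
class CapabilitySketch {
    // Assumed flag values: each capability occupies its own bit.
    static final byte READ = 1;
    static final byte WRITE = 2;
    static final byte LS_DIR = 4;

    // A read-only file system (for example one backed by HTTP) advertises only READ.
    static byte readOnlyCapabilities() {
        return READ;
    }

    // A local file system advertises several capabilities at once.
    static byte localCapabilities() {
        return (byte) (READ | WRITE | LS_DIR);
    }

    // True when every bit in 'required' is also set in 'capabilities'.
    static boolean supports(byte capabilities, byte required) {
        return (capabilities & required) == required;
    }

    public static void main(String[] args) {
        System.out.println(supports(localCapabilities(), WRITE));
        System.out.println(supports(readOnlyCapabilities(), WRITE));
    }
}
```

<p align="justify">Packing the capabilities into a single value lets a caller check with one bit-wise AND whether a file system can perform an operation before attempting it.</p>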



<h2>Compiling Terrier</h2>
<p align="justify">The main Terrier distribution comes pre-compiled as Java bytecode, and can be run on
any Java 1.5 JDK. You should have no need to compile Terrier unless:</p>
<ul>
  <li>You have altered the Terrier source code and wish to check or use your changes.</li>
  <li>You want to browse the source code of the query parser classes, which are generated automatically by compiling the grammar specifications with ANTLR.</li>
</ul>

<p align="justify">Terrier is distributed with two ways of compiling it: compile scripts for Unix-like
platforms and Windows, and a Makefile:</p>
<ul>
<li><tt>bin/compile.sh</tt> and <tt>bin/compile.bat</tt> : These build the terrier-$VERSION.jar file and put it in the lib/
    folder. They compile all files found in the src/ folder.</li>

<li><tt>Makefile</tt> : This is a classic Makefile for building Terrier, and is better
    maintained than bin/compile.sh. It has the following targets:
        <ul>
        <li>clean - removes all build process files</li>
        <li>compile - builds the Terrier query parser, the Terrier jar file and
                       new Javadoc. It compiles only the Java files listed in
                   MANIFEST.txt, and includes them in the jar file lib/terrier-$VERSION.jar</li>
        <li>doc javadoc  - builds the Javadoc</li>
        <li>distribution - builds the distribution file for the currently selected
                       target platform (e.g. terrier-$VERSION.tar.gz,
                       terrier-$VERSION.zip)</li>
        <li>unix         - builds terrier-$VERSION.tar.gz</li>
        <li>win          - builds terrier-$VERSION.zip, and runs all text files
                       through unix2dos</li>
		<li>test - runs the test harness. </li>
        </ul>
</li>
</ul>
<p align="justify"><b>NB:</b> We currently suggest that you use the Makefile instead of the <tt>bin/compile.sh</tt> or <tt>bin/compile.bat</tt> scripts, and that you always execute <tt>make clean compile</tt> to compile Terrier. This ensures that the TerrierParser is always built correctly.</p>

<p align="justify"><i>Are there files missing from the Terrier source code?</i> Yes: the source files for some classes are not included, because they are generated automatically by ANTLR during compilation. The build process invokes the ANTLR "compiler compiler", which generates the missing Java source files from the query parser specification (the .g files in src/uk/ac/gla/terrier/querying/parser/).</p>

<p align="justify">If you want to compile your own application code that uses Terrier, it is preferable to put the file lib/terrier-2.X.jar on your classpath rather than the folder src/.</p>

<p align="justify">If you use the <a href="http://www.eclipse.org">Eclipse IDE</a>, then you can get it to correctly compile Terrier by installing the <a href="http://antlreclipse.sourceforge.net/">Antlr Eclipse Plugin</a>.</p>

<p align="justify"><i>How do I run the test harness?</i> On the command line, run <tt>make test</tt>. There are many tests covering various indexing and retrieval functionalities. In all cases, Mean Average Precision (MAP) should be 1.0000. If this is not the case, check test.log to see what went wrong. The test harness will be extended as Terrier matures.</p>

<p></p>
[<a href="properties.html">Previous: List of Terrier properties</a>] [<a href="index.html">Contents</a>] [<a href="extend_indexing.html">Next: Extending Indexing</a>]
<!--!bodyend-->
<hr>
<small>
Webpage: <a href="http://ir.dcs.gla.ac.uk/terrier">http://ir.dcs.gla.ac.uk/terrier</a><br>
Contact: <a href="mailto:terrier@dcs.gla.ac.uk">terrier@dcs.gla.ac.uk</a><br>
<a href="http://www.dcs.gla.ac.uk/">Department of Computing Science</a><br>

Copyright (C) 2004-2008 <a href="http://www.gla.ac.uk/">University of Glasgow</a>. All Rights Reserved.
</small>
</body>
</html>
